Relevancer: Finding and Labeling Relevant Information in Tweet Collections
Abstract
We introduce a tool that supports knowledge workers who want to gain insights from a tweet collection but, due to time constraints, cannot read every tweet. Our system first pre-processes, de-duplicates, and clusters the tweets. The detected clusters are presented to the expert as so-called information threads. Based on the labels the expert assigns to these information threads, a classifier is then trained that can be used to classify additional tweets. As a case study, the tool is evaluated on a tweet collection based on the key terms ‘genocide’ and ‘Rohingya’. The average precision and recall of the classifier over six classes are 0.83 and 0.82, respectively. At this level of performance, experts can use the tool to manage tweet collections efficiently without missing much information.
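The pipeline stages named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The placeholder tokens, the exact-match de-duplication, the greedy Jaccard grouping (a stand-in for the clustering step), the 0.5 threshold, and the macro-averaging of precision and recall are all assumptions made for illustration. The abstract does not say how the per-class scores were averaged.

```python
import re
from collections import defaultdict


def normalize(tweet: str) -> str:
    """Lowercase, mask URLs and user mentions, drop a leading 'rt' marker."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", "<url>", text)   # assumed placeholder token
    text = re.sub(r"@\w+", "<user>", text)          # assumed placeholder token
    text = re.sub(r"^rt\s+", "", text)
    return " ".join(text.split())


def deduplicate(tweets):
    """Keep the first tweet for each normalized form (exact match only)."""
    seen, unique = set(), []
    for tweet in tweets:
        key = normalize(tweet)
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two normalized tweets."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def group_threads(tweets, threshold=0.5):
    """Greedy grouping: a tweet joins the first thread whose seed is similar enough."""
    threads = []
    for tweet in tweets:
        norm = normalize(tweet)
        for thread in threads:
            if jaccard(norm, normalize(thread[0])) >= threshold:
                thread.append(tweet)
                break
        else:
            threads.append([tweet])
    return threads


def macro_precision_recall(gold, predicted):
    """Unweighted mean of per-class precision and recall (assumed averaging)."""
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    precisions = [tp[l] / (tp[l] + fp[l]) if tp[l] + fp[l] else 0.0 for l in labels]
    recalls = [tp[l] / (tp[l] + fn[l]) if tp[l] + fn[l] else 0.0 for l in labels]
    return sum(precisions) / len(labels), sum(recalls) / len(labels)
```

In such a sketch, an expert would label each group returned by `group_threads`, and those labels would serve as training data for a classifier. `macro_precision_recall` then scores the classifier's predictions on held-out tweets.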
Similar Resources
Active Tweet Recommendation Based on User Interest Profiles
The rapid growth of Twitter has made it one of the most popular information sources of current affairs. Twitter users gather information about their topics of interest through their followees’ posts or by searching for relevant posts. However, users are often overwhelmed by the large number of tweets which makes it difficult for them to find relevant and non-redundant information about their in...
Overview of INEX 2014
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2014 evaluation campaign, which consisted of three tracks: The Interactive Social Book Search Track investigated user information seeking behavior when intera...
Collective Semantic Role Labeling for Tweets with Clustering
As tweets have become a comprehensive repository of fresh information, Semantic Role Labeling (SRL) for tweets has aroused great research interests because of its central role in a wide range of tweet related studies such as fine-grained information extraction, sentiment analysis and summarization. However, the fact that a tweet is often too short and informal to provide sufficient information ...
QuickView: NLP-based Tweet Search
Tweets have become a comprehensive repository for real-time information. However, it is often hard for users to quickly get information they are interested in from tweets, owing to the sheer volume of tweets as well as their noisy and informal nature. We present QuickView, an NLP-based tweet search platform to tackle this issue. Specifically, it exploits a series of natural language processing ...